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1.   Abstract 


Although  there  are  many  open  theoretical  questions  for  stiff  (and 
non-stiff)  variable  order,  variable  step  methods,  it  seems  unlikely  that 
there  will  be  any  major  computational  advances  as  far  as  integration  methods 
for  the  general  non-linear  problem.   The  major  drawback  of  many  of  the 
methods  that  have  been  proposed  in  the  last  few  years  is  that  they  involve 
too  many  matrix  operations,  or  operations  on  larger  matrices  than  in  some 
of  the  more  straightforward  techniques  such  as  the  backward  differentiation 
formulas.   However,  advances  in  techniques  for  handling  large,  sparse 
matrices  may  affect  the  choice  of  integration  method  for  very  large  problems, 
This  paper  examines  possible  improvements  that  might  occur  through  the  use 
of  very  inaccurate  Jacobians.   Such  Jacobians  might  be  chosen  so  that  the  LU 
decomposition  is  particularly  simple.   Some  methods  are  shown  to  remain 
stable  in  spite  of  very  large  errors  in  the  Jacobian.   These  results  are 
preliminary,  but  suggest  that  further  speed  increases  in  very  large  problems 
are  possible. 


2.   Introduction 

Although  a  large  number  of  A-stable  methods  are  known  (for  example, 
implicit  Runge-Kutta  -  Butcher  [3],  multiderivative  -  Enright  [6],  Genin  [9], 
Brown  [2],  extrapolation  -  Lindberg  [16],  and  other  clever  techniques  - 
Bickart  and  Picel  [l],  Liniger  and  Odeh  [17],  Rosenbrock  [22],  Skeel  and 
Kong  [24],  Sloate  and  Bickart  [25],  and  Watanabe  [26])  the  method  used  in 
the  majority  of  large  codes  is  based  on  the  backward  differentiation  formulas 
which  are  not  A-stable  for  order  greater  than  two,  and  which  do  not  have 
very  small  error  coefficients.   The  principal  reason  that  this  is  true,  apart 


from  the  existence  of  fairly  robust  automatic  software  incorporating  these 
methods  but  not  the  other  methods  (Hindmarsh  [10],  Hindmarsh  and  byrne  [ll], 
Krogh  [14],  Shampine  [23],  and  Gear  [8]),  is  that  most  other  methods  either 
require  exact  values  of  the  Jacobians  at  every  step  in  order  to  achieve 
their  accuracy  (e.g.  multiderivative  methods  and  t'le  Rosenbrock-like  methods), 
or  require  that  a  larger  system  of  non-linear  equations  be  solved  at  each 
step,  or  both.   While  neither  of  these  requirements  presents  a  particular 
problem  for  small  systems,  small  systems  do  not  consume  much  computer  time 
so  there  is  little  incentive  to  consider  other  methods  if  there  is  a 
reasonable  subroutine  in  the  library,  particularly  if  it  calls  for  no  more 
than  a  minimum  amount  of  programming  on  the  part  of  the  user.   However,  in 
large  systems,  such  as  those  that  arise  in  electrical  network  analysis, 
chemical  kinetic  problems  coupled  with  fluid  flow  considerations, 
lasing,  and  spatial  discretization  of  a  system  of  partial  differential 
equations  to  get  a  system  of  ordinary  differential  equations,  significant 
amounts  of  time  are  spent  in  solving  non-linear  equations  by  repeated 
solution-  of  linear  problems  in  a  quasi-Newton  iteration.   The  figure  of 
30%  of  the  total  solution  time  has  been  observed  for  systems  of  less  than 
a  thousand  ODEs,  and  this  figure  increases  with  n.   Considerable  savings 
could  be  realized  if  better  ways  to  handle  the  non-linear  systems  could  be 
developed.   Further,  since,  for  problems  which  require  modest  accuracy  and 
for  which  function  evaluations  do  not  cos.t  much,  one-step  methods  such  as 
implicit  Runge-Kutta  methods  may  be  superior,  it  may  be  that  these  methods 
will  turn  out  to  be  superior  when  better  matrix  methods  are  developed. 
The  improvement  may  be  particularly  significant  in  problems  whose  solution 
displays  repeated  transient  regions  (in  which  the  solution  varies  rapidly) 
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separated  by  relatively  quiescent  periods  in  which  the  solution  is  slowly 
varying,  as  can  happen  in  lasing  problems.   The  reason  for  the  potential 
advantage  of  one-step  methods  is  that  in  variable  step  methods,  the  step 
gets  large  in  the  quiescent  regions,  and  then  a  sudden  transient  looks  , 
very  much  like  a  discontinuity  that  cannot  be  handled  by  current  multistep 
methods. 

In  this  paper  we  will  suggest  some  approaches  that  may  lead  to 
improved  techniques  for  handling  the  non-linear  systems  of  equations.   This 
is  a  very  preliminary  report  of  ongoing  work  (which  may  encounter  insuper- 
able difficulties  and  have  no  value)  but  it  is  hoped  that  the  report  will 
indicate  some  of  the  problems  and  possible  solutions. 

3.   Why  is  a  Jacobian  needed? 

Consider  the  differential  equation 

yf  =  a.[y  -  u(t)]  +  u'(t)  (1) 

whose  solution  is 

y  =  u(t)  +  c.exp(at) 
If  a  <<  0  in  the  time  scale  in  which  u(t)  is  slowly  varying,  the  problem 
is  stiff.   In  fact,  as  a  tends  to  minus  infinity,  the  solution  tends  to 
u(t)  with  a  boundary  layer  at  t  =  0  in  the  case  that  y(0)  #  u(0).   We  can 
also  write  (1)  as  the  singular  perturbation  problem 

ey'  -  eu*  =  y  -  u(t)  =  g(y,t)  (2) 

where  e  =  1/a.   We  see  that  for  very  small  e  we  are  really  faced  with  the 
solution  of  the  equation  g(y,t)  =  0  at  each  time  step.   In  the  case  of  a 
non-linear  differential  equation  we  observe  that  if  the  Jacobian  J  =  3f/3y 
is  very  "large"  (that  is,  has  very  negative  eigenvalues),  small  deviations 
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of  y  from  the  value  that  makes  f(y,t)  almost  zero  lead  to  very  large 
derivatives  which  will  cause  the  solution  to  be  returned  near  to  the 
solution  of  f(y,t)  =  0  very  rapidly  (or  the  problem  is  unstable).   Conse- 
quently whatever  we  do  must  amount  to  solving  the  non-linear  problem 
f(y,t)  =  0  approximately.   Many  methods  for  the  aporoximate  solution  of 
non-linear  equations  make  some  use  of  the  Jacobian.   The  following  simple 
predictor-corrector  scheme  for  the  problem  y1  =  f(y,t)  illustrates  the  use 

of  the  Jacobian.   Let  y   and  y'  be  approximations  to  y  and  its  derivative 

n      n  J 

at  t  =  t  .   A  single  step  consists  of 

Euler  predictor         p    =  y  +  1  y' 

n+1     n     n 


One-step  corrector      y    =  y  +  bhy'    +  (1  -  b)hy' 

n+1    n      n+1  n 


(3) 


If  b  :>  .5  this  is  good  for  stiff  problems.   Since  the  corrector  is  implicit 

(as  are  all  methods  that  have  good  stability  properties  for  stiff  problems) , 

we  must  decide  how  to  solve  for  y    .   If  we  have  an  approximation  A  to  the 

Jn+l  rK 

Jacobian  3f/3y,  we  can  use  one  step  of  a  Newton  method  starting  from  the 
predictor  (as  is  done  in  most  current  codes)  to  get 

^n+1  "  Pn+1  -  [I  "  bhAl"'  [P„+1  "  *n  "  !*«W  "  (1  -  h)<]  <4) 

(The  dependence  of  f  on  t  is  not  explicitly  indicated  where  unimportant.) 
If  f(y)  =  Ay,  this  solves  the  corrector  in  one  step.   If  A  is  reasonably 
close  to  the  Jacobian  J  of  the  system,  the  method  may  still  have  good 
stability  properties.   (See  Klopfenstein  and  Davis  [13]  who  consider  some 
PECE  methods  which  are  stable  for  large  values  of  J  provided  that  A  is 
close  to  J  in  a  sense  that  requires  the  difference  to  remain  of  order  1.) 
To  see  what  happens  when  the  Jacobian  of  a  system  is  very  large,  consider 
a  sequence  of  systems  with  Jacobian  c J .   Replace  J  with  cJ  in  (4)  and  take 
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the  limit  as  c  goes  to  infinity.   Noting  that  f(y,t)  will  have  to  be 
replaced  with  cf(y,t)  we  get 

y    ■  P    -  J~l   f(p    ,  t   )  (5) 

yn+i    n+i        vpn+i  '   n+i  K    J 

which  is  a  Newton  step  for  the  solution  of  f(y,t)  =  0.   This  is  no  surprise, 
but  it  has  been  introduced  to  motivate  consideration  of  some  other  tech- 
niques for  the  solution  of  linear  and  non-linear  equations.   First  let  us 
consider  ways  of  solving  the  system  g(x)  =0.   We  will  assume  that  all  of 
the  eigenvalues  of  g  are  strictly  in  the  right  half-plane.   If  g  is  linear 
and  there  is  some  knowledge  of  the  location  of  these  eigenvalues,  there  are 
a  number  of  iterative  methods  that  can  be  used.   The  recent  thesis  of 
Manteuffel  [18]  discusses  a  Chebyshev  type  of  iteration  that  attempts  to 
estimate  the  location  of  the  critical  eigenvalues,  and  references  much  of 
the  literature  in  the  general  area.   However,  let  us  start  with  the  simplest 
(and  usually  useless)  iterative  technique  possible,  functional  iteration. 
The  iteration 

x  ,   =  x  -  g(x  ) 
m+1    m   B  m 

only  converges  if  we  can  bound  the  norm  of  I  -  J  to  be  less  than  one,  where 

J  is  the  Jacobian  of  g  (that  is,  if  the  eigenvalues  of  J  are  strictly  inside 

the  unit  circle  centered  at  1)  .   Since  that  is  an  unreasonable  restriction^! 

we  look  for  a  circle  of  radius  R  centered  at  R  that  contains  all  of  the 

eigenvalues  of  J.   If  they  are  all  in  the  positive  half -plane,  such  clearly 

exists.   Then  we  can  use  the  iteration 

x    =  x  -  R"1  g(x  )  (6) 

m+i    m  m 

It  converges,  albeit  very  slowly.   We  can  view  equation  (6)  as  a 


quasi-Newton  method  where  R  is  an  approximation  to  the  Jacobian.   In  fact, 
the  rate  of  convergence  is  1  -  d/R  where  d  is  the  smallest  distance  from 
the  radius  R  circle  centered  at  R  to  any  eigenvalue  of  J.   However,  we  are 
going  to  examine  the  application  of  a  single  iteration  of  this  type  to  the 
corrector  of  a  differential  equation  because  we  hope  that  we  can  get 
accuracy  with  a  good  predictor,  and  use  the  corrector  only  to  achieve 
stability.   Later,  we  will  examine  other  iterative  techniques  that  make  use 
of  a  greater  knowledge  of  the  Jacobian  than  a  bound  on  the  eigenvalues. 

4.   Regions  of  Absolute  Stability 

The  region  of  absolute  stability  of  a  method  is  that  set  of  values 
of  ha  such  that  the  numerical  solution  of  y'  =  ay  decays  when  step  size  h 
is  used.   A  method  is  A-stable  if  the  region  of  absolute  stability  includes 
all  of  the  negative  half-plane.   Many  methods  are  not  A-stable,  but  usually 
if  the  eigenvalues  of  the  Jacobian  are  inside  the  region  of  absolute 
stability  we  will  get  reasonable  error  behavior.   (See  Odeh  and  Liniger  [19] 
and  Dahlquist  [4],  [5].)   The  region  of  absolute  stability  for  the  corrector 
in  (3)  is  as  shown  in  Figure  1.   If  b  i  .5  the  method  is  A-stable.   However, 
suppose  that  we  try  to  "solve"  the  corrector  with  a  single  modified 
functional  iteration  as  given  in  (4)  using  -R  as  an  "approximation"  to  A  in 
analogy  with  (6).   (Strictly  speaking,  we  must  use  -R  times  the  identity  as 
the  approximation.)   If  we  now  consider  the  behavior  of  the  linear  scalar 
test  equation  y'  =  ay  we  get 

y    =  [1  +  ha  +  [1  -  bhR]_1  b(ha)2]y 

or 


y .   =  [1  -  bhR]"   [1  +  h(a  -  bR)  +  bh2a(a  -  R)]y 
n+l  n 

2 
Because  h  a(a  -  R)  appears  in  the  numerator,  while  only  hR  appears  in  the 

denominator,  this  cannot  be  stable  for  other  than  a  small  region  of  the  ha 

plane  (of  order  1).   The  problem  occurs  because  the  calculation  of  y' 

introduces  a  factor  a.   This  happens  because  we  are  doing  the  equivalent  of 

a  PECE  method.   Suppose  instead  we  look  at  a  PEC  method.   If  y  and  d  are 

n      n 

the  numerical  approximations  to  y(t  )  and  hy'(t  ),  we  can  write  a  PEC 

n  n 


method  in  the  form 

Predictor 

Evaluation 

Corrector 


n+l     n    n 


d  4.,   =  "hRy  4.,   +  thRP  J.,   +  hf(P  4.,)] 

n+l       n+l       n+l       n+l 


y    =  y  +  bd  ,  +  (1  -  b)d 

n+l    n     n+l  n 


(7) 


Here  we  have  replaced  the  Jacobian  by  -R  times  the  identity  as  before.  The 
behavior  of  this  process  for  the  scalar  linear  problem  y'  =  ay  is  described 
by 

-1 


n+l 


n+l 


n+l 


-b 


-  [1  +  bhR] 


hR 


-1 


1  -  b 


h(a  +  R)    h(a  +  r) 


yn 

d 
n_ 

1  +  bh(a  +  R)     1  -  b  +  bh(a  +  R) 


ha 


ha  +  bhR 


*-.        — 

yn 

d 
n 

=  Sw 


(8) 


The  numerator  and  denominator  of  all  elements  of  S  are  linear  in  ha  and 
hR,  so  it  is  possible  that  S  is  power  bounded  for  large  values  of  a  (in 
particular,  up  to  values  of  order  R) .   In  fact,  that  is  the  case  if  b  ^  .5. 
The  region  of  absolute  stability  of  (7)  can  be  found  by  plotting  the  locus 
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of  values  of  ha  for  which  S  has  a  unit  eigenvalue  for  a  fixed  value  of  hi'. 
It  has  the  form  shown  in  Figure  2  provided  that  b  >■  .5. 

The  interesting  aspect  of  this  is  that  the  region  of  absolute 
stability  can  be  made  arbitrarily  large  by  choosing  R  large  enough.   This 
means  that  equation  (7)  is  a  first  order  (second  order  if  b  =  .5)  method 
that  has  a  very  large  region  of  absolute  stability.   Unfortunately,  very 

little  comes  for  free,  and  an  examination  of  the  error  coefficients  reveals 

1/2 
a  factor  of  (hR)     in  the  error  growth.   To  see  this,  apply  equation  (7) 

to  equation  (1) ,  setting  the  global  error  e   to  be  a  vector  with  components 

y  -  u(t  )  and  d   -  hu'(t  ).   A  little  computation  shows  that 
n      n       n        n 


e  ,   =  Se  +  (1  +  bhR) 
n+1     n 


+  (1  +  bhR) 


-1 


2b  -  1  -  bh(R  +  a) 

_-hR(2b  -  1)  -  h(R  +  a^ 
3b  -  1  -  bh(R  +  a) 


h2u"/2 


-1 


h3um  /6  +  OCh4)    (9) 


-hR(3b  -  1)  -  h(R  +  a) 
where  S  is  defined  in  equation  (8).   The  truncation  error,  which  is  the  last 
three  terms  in  (9) ,  is  bounded  as  R  gets  large  because  of  the  R  in  the 
denominator,  and  S  is  stable,  so  the  solution  of  (9),  which  is 

N-l 


_N-n-l      QN 
e„  =  l      b      q   +  b   e 
n       0 
n=o 


'N 


(10) 


where  q   is  the  truncation  error  in  the  n-th  step,  is  bounded.   However,  the 
n 

N 
bound  on  S   involves  a  factor  R.   This  can  be  seen  by  considering  the  form 

of  S  for  very  large  hR.   It  tends  to  the  matrix 

1    1 

0    1 


which  is  unstable.  To  find  the  growth  as  a  function  of  hR  we  must  look  at 
the  matrix  more  closely.  We  show  in  lemma  1  below  that  the  matrix  Q  which 
takes  S  to  a  diagonal  form  via  a  similarity  transformation  is  such  that 


-1 


1/2 


jQ  -||  MqII  ■-  0(hR) 
can  then  deduce  from  (10)  that 


If  we  assume  that   q    is  bounded  by  Kh  ,  we 

i  i  -ini  i 


||eN||  *  ||sN||MKh2+  ||SN 


-1 


£  (KNh  +  lie  II)  I | Q 
=  0((Nh2  +  |  |e  ||)  (hR) 


1/2 


(ID 


,-1 


LEMMA  1    If  S  is  as  given  in  equation  (8) ,  hR  is  large,  and  Q   S  Q  is 

—  1  1/2 

diagonal,  then  ||Q   |    |q|)  =  0(hR)    is  the  smallest  bound  possible. 
PROOF      The  eigenvalues  of  S  are  not  identically  one  when  hR  is  finite. 
When  hR  is  finite,  det(zl  -  S)  =  (z  -  l)2,  but  when  hR  is  finite, 
det(zl  -  S)  is  a  monic  polynomial  in  z  with  coefficients  that  are  linear 
in  (1  +  bhR)   because 


S  = 


"l 

-     — 1 

1 

+    (1  +  bhR)    1 

"b 

0 

1 

1 

[ha    ha  -  f] 


=  A  +  (  1  +  bhR)"1  B 
where  B  is  a  rank  one  matrix.   Furthermore,  since  the  constant  term  in 

det(zl  -  S)  is  det(-S)  which  is  not  identically  one,  we  can  conclude  that 

-1/2 
the  eigenvalues  of  S  are  1  +  0((hR)     )  for  large  hR.   (This  can  be  seen 

by  expanding  the  eigenvalues  as  as  power  series  in  an  arbitrary  power  of 

hR.)   The  columns  of  Q  are  scalar  multiples  of  the  eigenvectors  of  S,  and 

T 
it  is  easy  to  see  that  the  eigenvectors  of  S  have  the  form  (1    p)  , 

l/2  -1 

where  p  =  0(hR)    .   From  this  we  see  that  the  rows  of  Q   have  the  form 


(1     1/p) •   No  change  of  scale  will  avoid  the  term  1/p,  so  we  conclude 
that  ||Q_1|  I  ||q||  =  0(p)  ■  0((hR)1/2).   QED 

The  analysis  above  shows  that  if  a,  the  eigenvalue  of  the  method, 

1/2 
is  small  compared  to  R,  the  error  has  a  component  (hR)    .   However,  if  a 

is  of  the  order  of  -R,  then  a  similar  analysis  shows  that  the  error  does 

not  depend  on  such  a  term.   This  means  that  in  a  system  of  equations  with 

several  eigenvalues,  we  will  get  large  error  terms  in  the  components  with 

small  eigenvalues.   This  suggests  that  in  equation  (7)  we  should  use  a 

matrix  A  in  place  of  R  that  has  large  eigenvalues  corresponding  to  the 

large  eigenvalues  of  J,  but  small  (or  zero)  eigenvalues  corresponding  to 

reasonable  size  eigenvalues  of  J.   To  put  this  precisely,  we  can  show  that 

if  such  -an  A  were  simultaneously  diagonalizable  with  J  and  the  eigenvalues 

of  the  two  matrices  corresponded  in  this  way,  the  method  would  be  both 

1/2 
stable  and  have  an  error  term  that  did  not  include  the  (hR)     term. 

Unfortunately,  such  matrices  are  not  necessarily  "simpler"  than 

the  original  matrix  J  (by  which  we  mean  that  they  not  necessarily  easier 

to  LU  decompose).   It  appears  to  be  important  that  A  have  eigenvectors 

that  are  close  to  the  eigenvectors  of  J  corresponding  to  the  large 

eigenvalues.   (See,  for  example,  Lambert  [15].) 

5.   Higher  Order  Methods 

The  analysis  can  be  extended  to  higher  order  methods.   Consider 
the  PEC  scheme 
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p  J.  =  £ 

n+l    p 

d  .   =  -hRy  ,   +  hRp  ^  +  hf (p  .  )  (12) 

n+l      'n+l     *n+l      Vl 

y  ,   =bd  ,   +  I 
Jn+1     n+l    c 

where  E   and  Z   are  linear  combinations  of  past  values  of  y  and  hy'  for 
p      c 

the  predictor  and  corrector  respectively.   By  considering  y'  =  ay,  we  see 
that 

y  , ,  -  (1  +  bhR)"1  [Z  +  bh(R  +  a)Z  ] 
n+l  c  p 

d  ,,  =  (1  +  bhR)    [h(R  +  a)I   -  hRZ  ] 
n+l  p      c 

Some  straightforward  calculation  on  this  shows  that  if  the  value  of  ha  for 
which  an  eigenvalue  of  the  corrector  process  alone  is  y   is  written  as 
ha  =  u  (y) »  and  if  the  value  of  ha  for  which  an  eigenvalue  of  the  conven- 
tional predictor  corrector  PEC  process  is  y   is  written  as  ha  =  u   (y) »  then 

pc 

the  value  of  ha  for  which  an  eigenvalue  of  the  process  in  equation  (13)  is 
Y  can  be  written  as 

ha  =  u(Y)  =  u   (Y)  +  hR[   P°    (  >  ° ]  (14) 

c 

The  region  of  absolute  stability  of  equation  (12)  can  be  found  by  examining 

its  boundary  which  is  included  in  the  set  of  values  of  ha  given  by  equation 

i  ft 

(14)  with  y  =  e   .   For  many  pairs  of  predictors  and  correctors,  this  region 
can  be  made  arbitrarily  large  by  increasing  R. 

Unfortunately,  a  study  of  the  error  term  similar  to  that  in  the 

k/(k+l) 
previous  section  shows  that  a  factor  (hR)        appears  in  the  error  growth 

rate,  where  k  is  the  step  number  of  the  method,  so  the  higher  order  may  not 

help  too  much. 
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6.   Tentative  Conclusioas 

The  stability  of  these  methods  indicates  possible  approaches.   in 
particular,  two  avenues  seem  to  be  worth  exploring.   One  is  to  attempt  to 
use  the  ideas  of  Chebyshev  iteration  which  can  hasten  the  convergence  of 
iterative  methods  for  linear  equations  (see  Manteuffel  [18]),  and  the  other 
is  to  consider  using  different  approximations  R  to  the  Jacobian  J  on  dif- 
ferent steps.   Such  approximations  could  be  chosen  to  "beat  down"  the 
eigenvectors  corresponding  to  the  large  unwanted  components  in  the  Jacobian 
in  the  spirit  of  Lambert  [15].   In  fact,  the  alternating  direction  method 
is  a  form  of  this  (see  Richtmeyer  and  Morton  [21],  p.  176).   In  the  ADI 
method,  an  approximation  is  chosen  that  has  the  same  eigenvectors  as  J,  and 
alternately  damps  the  large  eigenvectors  corresponding  to  the  x  and  y 
spatial  directions. 
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